A Flexible Fuzzy Expert System for Fuzzy Duplicate Elimination in Data Cleaning
Authors
Abstract
Data cleaning deals with the detection and removal of errors and inconsistencies in data gathered from distributed sources. This process is essential for drawing correct conclusions from data in decision support systems. Eliminating fuzzy duplicate records is a fundamental part of the data cleaning process. The vagueness and uncertainty involved in detecting fuzzy duplicates make it a natural niche for applying fuzzy reasoning. Although uncertainty algebras such as fuzzy logic are well known, their applicability to the problem of duplicate elimination has remained unexplored and unclear until now. In this paper, a novel and flexible fuzzy expert system for the detection and elimination of fuzzy duplicates in the data cleaning process is devised, which circumvents the repetitive and inconvenient task of hard-coding. Some of the crucial advantages of this approach are its flexibility, ease of use, extensibility, fast development time, and efficient run time when used in various information systems.
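To make the idea of rule-based fuzzy duplicate detection concrete, the following is a minimal sketch and not the system described in the paper. It assumes two hypothetical record fields (`name`, `address`), uses `difflib.SequenceMatcher` as an illustrative similarity measure, and uses an arbitrary ramp membership function with breakpoints at 0.5 and 0.9. The rules are held as data in `RULES`, which illustrates how rules can be changed without hard-coding the matching logic.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Crisp string similarity in [0, 1] (difflib ratio, an illustrative choice)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def high(x: float) -> float:
    """Membership in 'high similarity': ramps from 0 at x=0.5 to 1 at x=0.9 (assumed breakpoints)."""
    return min(1.0, max(0.0, (x - 0.5) / 0.4))


def low(x: float) -> float:
    """Membership in 'low similarity', taken as the fuzzy complement of 'high'."""
    return 1.0 - high(x)


# Hypothetical rule base, kept as data so rules can be edited without touching the engine.
# Each rule: ({field: membership function, ...}, consequent label).
RULES = [
    ({"name": high, "address": high}, "duplicate"),
    ({"name": low}, "distinct"),
    ({"address": low}, "distinct"),
]


def duplicate_degree(r1: dict, r2: dict) -> float:
    """Return a degree in [0, 1] to which two records are considered duplicates."""
    sims = {f: similarity(r1[f], r2[f]) for f in ("name", "address")}
    strength = {"duplicate": 0.0, "distinct": 0.0}
    for antecedent, label in RULES:
        # Fuzzy AND over the rule's conditions (min), fuzzy OR across rules (max).
        firing = min(mf(sims[f]) for f, mf in antecedent.items())
        strength[label] = max(strength[label], firing)
    total = strength["duplicate"] + strength["distinct"]
    # Normalise the two rule strengths into a single duplicate score.
    return strength["duplicate"] / total if total else 0.0
```

A caller would compare `duplicate_degree(a, b)` against a chosen threshold to decide whether to merge two records; the threshold, like the membership breakpoints, is a tuning assumption rather than something given in the abstract.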
Similar resources
Eliminating Fuzzy Duplicates in Data Warehouses
The duplicate elimination problem of detecting multiple tuples that describe the same real-world entity is an important data cleaning problem. Previous domain-independent solutions to this problem relied on standard textual similarity functions (e.g., edit distance, cosine metric) between multi-attribute tuples. However, such approaches ...
A Fuzzy Expert System for Predicting the Performance of Switched Reluctance Motor
In this paper a fuzzy expert system for predicting the performance of a switched reluctance motor has been developed. The design vector consists of design parameters, and output performance variables are efficiency and torque ripple. An accurate analysis program based on Improved Magnetic Equivalent Circuit (IMEC) method has been used to generate the input-output data. These input-output data i...
A Fuzzy Expert System & Neuro-Fuzzy System Using Soft Computing For Gestational Diabetes Mellitus Diagnosis
Gestational diabetes mellitus (GDM) is a kind of diabetes that requires persistent medical care and patient self-management education to prevent acute complications. One of the common and main problems in diagnosing diabetes is the weakness of diagnosis in the initial stages of the illness. This paper intends to propose an expert system in order to diagnose the risk of GDM by using FIS model. The knowl...
A Flexible Link Radar Control Based on Type-2 Fuzzy Systems
An adaptive neuro-fuzzy inference system based on interval Gaussian type-2 fuzzy sets in the antecedent part and Gaussian type-1 fuzzy sets as coefficients of the linear combination of input variables in the consequent part is presented in this paper. The capability of the proposed method (which we name ANFIS2) for function approximation and dynamical system identification is remarkable. The structure o...